I. INTRODUCTION


A. MOTIVATION FOR THIS THESIS 

As computing systems perform more and more sophisticated functions,
the software components of such systems necessarily become larger and
more complex. In addition, many critical systems, such as nuclear
power plants and aircraft control systems, are controlled by software. The
growth in size and complexity of software results in an even faster growth
in the complexity of software testing. Testing is currently a very
labor-intensive and expensive process that accounts for approximately 50% of
software system development (Myers 1979, Korel 1990).

The need for reliable software operation is increasing rapidly. This
implies that extensive software testing is frequently necessary despite
its expense. This thesis deals with software testing and is intended as a sequel
to previous research on failure region identification (Shimeall et al. 1991) and
failure region clustering (Ginn 1991).


B. OUTLINE OF THE PROBLEM 

Failure regions appear to have a tendency to cluster, so the neighborhood
of a specific failure region may reveal more software faults than the
one that caused the region. An empirical study (Ginn 1991) on data obtained
by Shimeall and Leveson (1991) has demonstrated this clustering tendency
of failure regions using appropriate clustering criteria for multidimensional
nominal types of data (Jain and Dubes 1988). A number of issues have been
raised by this research:


• Several failure regions demonstrated a strong clustering tendency in one
dimension but weak clustering in other dimensions. However, it is not
known whether this is a coincidence or normal behavior of failure regions.

• Software faults were numbered in the order in which they were discovered by the
various testing techniques applied by Shimeall and Leveson (1991), so
that many of the sequentially numbered faults were discovered by the
same detection technique. There was also a strong tendency of
clustering for sequential faults, but it is as yet unknown whether there is a
correlation between certain detection techniques and certain types of
fault clusters.

• It is not clear which types of conditions and variables are more likely
to result in clusters.


There is empirical evidence that known failure regions may be used to 
understand the relationship of one fault to another. Failure regions offer a 
mechanism for identifying common features among faults, because the 
relationship between two failure regions corresponds to the relationship of 
the code locations of the associated faults. The goal of this research will be 
an improved testing technique that incorporates failure region behavior. To 
do this, we need to better understand how parts of the program interact, 
since faults with similarities in their failure regions are expected to occur
under similar conditions.


C. OVERVIEW OF THE THESIS 

Chapter II gives an extensive review of preceding research on software
testing in general and failure region analysis in particular. Chapter III
introduces a testing technique based on clustering and presents empirical results
from the application of this technique to the results of a software
experiment. Finally, Chapter IV summarizes the conclusions that can be
drawn from the results and offers suggestions and recommendations for
further research.
II. BACKGROUND AND RELATED WORK

This chapter reviews software testing definitions and methods for
dealing with software fault and failure association. Symbols and terminology
are similar to those used in the classical papers (such as Weyuker and
Ostrand 1980, Goodenough and Gerhart 1975). A brief overview of theories
of software testing related to our research is presented. Previous work, both
empirical and theoretical, in the area of failure region analysis is included.
Finally, the cluster analysis technique that has been used by Ginn (1991) is
briefly presented.


A. SOFTWARE TESTING 


1. Basic Definitions 
Testing is a method of program verification that deduces from
execution that a program possesses required properties (Morell 1990).
The input data for the majority of programs come from a multidimensional
space. An example given by Ammann and Knight (1988) refers
to a program processing an input of 20 floating point numbers, thus having
20 dimensions.


We shall use $D$ for the input domain of a program $F$, and $R$ for the
output range of $F$. On input $d \in D$, $F$, if it terminates, produces output
$F(d) \in R$. Hence the program may be viewed as a mapping from the input
domain to the output range.

The output specification for $F$ is given by $OUT(x,y)$, where $x \in D$
and $y \in R$. $F$ is correct on input $d$ (abbreviated $OK(d)$) if $F(d)$ exists and
$OUT(d,F(d))$ holds. A test $T \subseteq D$ is formally defined as a subset of the input
domain (after Weyuker and Ostrand 1980, Goodenough and Gerhart 1975).
More often than not, software failures appear at what seems to be an obscure
set of data or a special case (Ammann and Knight 1988). This introduces an
additional complication to the testing process.
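The definitions above can be made concrete with a small sketch. The names F, OUT, OK, D and T follow the text; the particular program (an absolute value routine with a seeded special-case fault) and the chosen test are hypothetical and serve only as an illustration.

```python
# Hypothetical program F: intended to return the absolute value of an integer,
# but with a seeded fault that shows up only for the special case d == -3.
def F(d: int) -> int:
    if d == -3:              # seeded fault: an "obscure" special case
        return -3
    return d if d >= 0 else -d


# Output specification OUT(x, y): y must be the absolute value of x.
def OUT(x: int, y: int) -> bool:
    return y == abs(x)


# OK(d): F terminates on d (assumed here) and its output satisfies OUT.
def OK(d: int) -> bool:
    return OUT(d, F(d))


D = range(-10, 11)           # a small, finite input domain
T = {-5, 0, 7}               # a test is a subset of D

failing = [d for d in D if not OK(d)]
print(all(OK(d) for d in T))  # every input in T passes
print(failing)                # yet the domain still contains a failing input
```

The test T passes although the program is faulty, which is the complication caused by failures that occur only on special cases.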


2. Ideal, Valid, and Reliable Tests 
An ideal test must be valid, which implies that for every fault in
program $F$ there exist entries in test $T$ that cause the fault to execute and produce
incorrect output (Goodenough and Gerhart 1975). Formally:

$$\mathrm{Ideal}(T) \rightarrow (\forall\, \mathrm{Fault} \in F : \exists\, d \in T \wedge \neg OK(d)) \qquad (2.1)$$

A reliable test is one that produces either entirely correct output or
entirely incorrect output. Formally:

$$\mathrm{Reliable}(T) \rightarrow [(\forall\, d \in T : OK(d)) \vee (\forall\, d \in T : \neg OK(d))] \qquad (2.2)$$


The importance of a failure region clustering theory becomes apparent
here from the point of view of test selection. Weyuker and
Ostrand (1980) showed that an ideal and reliable test is exhaustive; it is therefore
desirable, for the sake of feasibility, to select a much smaller test aimed
at revealing certain types of faults. A failure region clustering theory
provides some guidelines for this type of selection.
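As an illustration of the reliability condition, the following sketch checks whether a test produces uniformly correct or uniformly incorrect results. The function names and the correctness predicate `ok` (a program that fails only on the input -3) are assumptions made for this example and do not come from the cited papers.

```python
from typing import Callable, Iterable


def is_reliable(T: Iterable[int], ok: Callable[[int], bool]) -> bool:
    """Reliable(T): every input in T passes, or every input in T fails."""
    results = [ok(d) for d in T]
    return all(results) or not any(results)


def reveals_failure(T: Iterable[int], ok: Callable[[int], bool]) -> bool:
    """True when at least one input in T produces incorrect output."""
    return any(not ok(d) for d in T)


def ok(d: int) -> bool:
    # Hypothetical correctness predicate: the program fails only on d == -3.
    return d != -3


print(is_reliable({-5, 0, 7}, ok))    # all inputs pass
print(is_reliable({-3, 0}, ok))       # mixed results
print(reveals_failure({-3, 0}, ok))   # the fault is exposed by -3
```

A small test such as {-3, 0} exposes the fault without being reliable, which is why guidance on where to sample the input domain matters more in practice than the ideal of an exhaustive test.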


B. PROGRAM PATHS 


1. The Concept of a Program Path 

A path in a program is defined as a sequence of statements (Richardson
and Clarke 1985). A path may be decomposed
into a number of subsidiary paths. A block statement is decomposed into its
constituent enclosed statements, whose execution depends upon the evaluation of
a condition that is also part of the block statement.

The control flow statements in a computer program partition the
input space into a set of mutually exclusive domains, each of which causes
a corresponding path to be executed (White and Perera 1986). This concept
of path offers a natural way to partition the input domain.





The subdomain $D_j$ of path $j$ is defined by a Boolean expression
(say $P(j)$) that is the conjunction of the path's branch predicate constraints.
In general these predicates are expressed in terms of both local (program)
and input variables. However, it is possible to replace each program variable
appearing in the predicates by its symbolic value, defined in terms of input
variables along that path, and obtain an equivalent constraint that is the partition
boundary, $P(j)$, as a function of input variables only (see also predicate
interpretation, after White and Wiszniewski 1988).
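The partition of the input domain by path predicates can be checked on a toy example. The function `classify` and its three paths are hypothetical; the dictionary `P` holds the path predicates obtained by hand after replacing the local variable `t` by its symbolic value `x + y`.

```python
def classify(x: int, y: int) -> str:
    t = x + y              # local (program) variable
    if t > 10:             # branch predicate expressed on a local variable
        if x > 0:
            return "path 1"
        return "path 2"
    return "path 3"


# Path predicates P(j) written in terms of input variables only.
P = {
    "path 1": lambda x, y: (x + y > 10) and (x > 0),
    "path 2": lambda x, y: (x + y > 10) and not (x > 0),
    "path 3": lambda x, y: not (x + y > 10),
}


def check_partition(lo: int = -20, hi: int = 20) -> bool:
    """Exactly one predicate holds per input, and it names the executed path."""
    for x in range(lo, hi + 1):
        for y in range(lo, hi + 1):
            holds = [j for j, pred in P.items() if pred(x, y)]
            if holds != [classify(x, y)]:
                return False
    return True


print(check_partition())
```

The check only samples a finite grid, so it illustrates the mutual exclusivity of the subdomains rather than proving it.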


2. Programs as Sets of Partial Functions 

Using the concept of path, as defined in the previous subsection,
we can model a program as a set of partial functions from the input
partitions ($D_j$) to the output space ($O$), each of the partial functions
corresponding to the execution of a sequence of statements along the
corresponding path ($j$) (Richardson and Clarke 1985). Formally:

$$S_{j,1} S_{j,2} \ldots S_{j,n}(D_j) = O' \subseteq O \qquad (2.3)$$

where $S_{j,1} S_{j,2} \ldots S_{j,n}$ stands for the sequence of statements along path $j$.
(The condition evaluations are included in the sequence of statements for
consistency although, since the path is predetermined, they do not affect
the result of the operation.) In the case of loops executing along the path
(let $S_{j,i} S_{j,i+1} \ldots S_{j,i+m}$ be the subsequence of statements in the loop body),
the path sequence of statements may be written
$S_{j,1} S_{j,2} \ldots (S_{j,i} S_{j,i+1} \ldots S_{j,i+m})^{*} \ldots S_{j,n}$.
This model assigns sequences that differ
only by their number of loop iterations to the same program path.

As a path may be decomposed into other paths, an input domain
partition may, by the same token, be refined hierarchically as new
branch predicate conditions are ANDed to the existing ones. This suggests a
more generic approach to the issue of input domain partition than the one
suggested by Richardson and Clarke (1985), who divide this
partition into implementation and specification partitions and point out
certain discrepancies due to the inherent differences between specification
and implementation languages.


3. Regular Expressions for Paths 
The set of paths on a flowchart can be expressed in algebraic
form (Beizer 1990). Path expressions are converted to regular expressions
that can be used to examine structural properties of program paths.

For any single path $j$ of the program, the sequence of statements
$S_{j,1} S_{j,2} \ldots S_{j,n}$ introduced in the previous subsection is the path product of $j$. Path
products are also defined on path segments. A path that consists of
successive path segments has a path product equal to the concatenation of
their path products. A set of parallel paths between two nodes has a path
product equal to the sum of the path products of the parallel paths.

Condition evaluations are retained in the path products for
consistency (cf. previous subsection). In fact, the path products of the two
segments starting from a decision point (such as an if C then else construct
or a loop exit condition) are preceded by the condition $C$ and the condition
$\neg C$ respectively, depending upon which part of the decision (if or else)
results in their execution. The regular expressions for paths provide a concise
notation for both the path conditions (which, for a given set of
paths, are derived by conjunction of all conditions on successive path
segments and disjunction of conditions on parallel path segments) and the path
actions (which, in the previous subsection, are modelled as partial functions
from the input domain to the output range).
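The concatenation and sum of path products can be sketched for loop-free path expressions as follows. The nested-tuple encoding ("seq" for successive segments, "alt" for parallel segments) and the statement labels are assumptions chosen for this illustration; they are not Beizer's notation.

```python
from itertools import product


def paths(expr):
    """Enumerate the statement sequences denoted by a loop-free path expression.

    An expression is either a label (str) or a tuple:
      ("seq", e1, e2, ...)  concatenation of successive segments
      ("alt", e1, e2, ...)  sum of parallel segments
    """
    if isinstance(expr, str):
        return [[expr]]
    kind, *parts = expr
    if kind == "alt":       # sum: collect the paths of every alternative
        return [p for part in parts for p in paths(part)]
    if kind == "seq":       # product: pick one path per segment and concatenate
        choices = product(*(paths(part) for part in parts))
        return [sum(combo, []) for combo in choices]
    raise ValueError(f"unknown operator: {kind}")


# S1; if C then S2 else S3; S4, with the condition kept in each path product
expr = ("seq", "S1", ("alt", ("seq", "C", "S2"), ("seq", "not C", "S3")), "S4")
for p in paths(expr):
    print(" ".join(p))
```

Keeping the labels "C" and "not C" inside the products makes the path condition of each enumerated path readable directly from its sequence.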


C. ON ERRORS AND FAULTS 
A fault is an erroneous piece of program source code, while an error
is a discrepancy between a computed value and the true, specified, or
theoretically correct value.


A failure is the inability of a module to perform its specified
function and includes both erroneous output and failure to produce output
(see ANSI-IEEE STD 610.12-1990).

A fault ($E$) is formally modeled as a 3-tuple (Shimeall et al. 1991):

$$E = \langle L, V, C \rangle \qquad (2.4)$$

where $L$ is the location of the fault (which is some program statement), $V$ is
the list of variables that form the error caused by the fault, and $C$ is a
Boolean condition under which the fault is activated. Sometimes a fault is
"distributed" over more than one location, while certain types of faults, such as
missing functions, do not have a well-defined location. An interesting point is
that fault activation does not always imply a failure (coincidental
correctness, Morell 1988). Three conditions must hold for
a fault to produce a failure (Shimeall et al. 1991): apart from $C$, these are
the reachability condition and the error propagation condition.

A basic assumption about the faults in code is the competent programmer
hypothesis (DeMillo et al. 1978). This states that a competent programmer
will write a program that is syntactically close to the correct program.

A taxonomy of the types of faults possibly found in a program is given
by Beizer (1990):


1. Requirements and Specifications 
These include incomplete or self-contradictory specifications,
missing, wrong or superfluous features, and poorly specified feature
interactions.


2. Structural Errors 
These include control and sequence errors, logic errors, incorrect
application of formulae, and use of uninitialized variables. They can be further
divided (Richardson and Clarke 1985) into:

a. Computation Errors
A computation error occurs when the correct path through the
program is taken, but the output is incorrect because of faults in the
computation along the path.

b. Domain Errors
A domain error (White and Perera 1986) occurs when a specific
input follows the wrong path due to an error in the control flow of the
program. Domain errors are of two kinds:

(1) Missing Path Errors: They occur when a special case
requires a unique sequence of actions, but the program does not contain a
corresponding path.

(2) Path Selection Errors: They occur when the program
recognizes the need for a path, but incorrectly determines the conditions
under which the path is executed.


3. Data Errors
These occur when a specific input follows the correct path, but an error
such as a wrong data declaration or wrong data initialization (especially in
shared dynamic objects) results in erroneous output.


4. Coding Errors
The most common coding errors are documentation inconsistencies,
typographical errors, and erroneous use of a program statement when its side
effects are not well understood.


5. Interface Errors 
These include interface communication problems, incorrect input-output
format, wrong subroutine control sequence, wrong call parameters, and
inconsistent entry or exit parameter values.


D. ON FAILURE REGIONS 

A software failure region ($G \subseteq D$) is the set of all input values that are
mapped by an individual program fault onto any failure (or onto a failure set,
as in Figure 1.1). The concept of failure region includes both
the input points that cause erroneous output and their geometry. These
sets are always finite, since the number of representations in a finite machine
is limited, but more often than not they are intractably large. The failure region
boundaries are defined by Boolean conditions on the input domain.

Shimeall et al. (1991) demonstrate a technique that analytically determines
the three conditions for a known fault (the conditions
for reaching the fault, activating it, and propagating the resulting error) by symbolically executing
(King 1976) the source code on every "loop [0,1]" path (loops are handled
as in the previous section, by applying either the exit condition or the loop
effects) and conjoining the obtained Boolean expressions. The conjunction
of those conditions is the mathematical specification of the failure region
boundary (which may be called Bound(G)), subject to the limitations of finite
representation in computing machines.





Figure 1.1 Associations between Failure Regions, Faults, and Failures
(adapted from Shimeall et al. 1991)
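The relation between a fault, its three conditions, and its failure region can be sketched as follows. The class `Fault` mirrors the 3-tuple of location, variables and activation condition; the concrete fault, the reachability and propagation predicates, and the two-dimensional input domain are hypothetical values chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Input = Tuple[int, int]   # an input point (x, y)


@dataclass(frozen=True)
class Fault:
    location: str                         # L: statement containing the fault
    variables: Tuple[str, ...]            # V: variables that carry the error
    activation: Callable[[Input], bool]   # C: condition activating the fault


def in_failure_region(d: Input, fault: Fault,
                      reach: Callable[[Input], bool],
                      propagate: Callable[[Input], bool]) -> bool:
    """An input lies in the failure region when all three conditions hold."""
    return reach(d) and fault.activation(d) and propagate(d)


# Hypothetical fault and conditions
fault = Fault("line 12", ("t",), lambda d: d[1] < 8)


def reach(d: Input) -> bool:
    return d[0] < 5            # the faulty statement executes only when x < 5


def propagate(d: Input) -> bool:
    return d[0] != 0           # the error is masked when x == 0


D = [(x, y) for x in range(10) for y in range(10)]
G = [d for d in D if in_failure_region(d, fault, reach, propagate)]
print(len(G), (0, 3) in G)
```

The point (0, 3) reaches and activates the fault but is not in G, which corresponds to coincidental correctness.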


E. BOUNDS CORRELATION BETWEEN DIFFERENT FAILURE REGIONS
The dimensions of the input domain of a program can provide a set of
criteria for the correlation between different failure regions. To this end, a
classification of these dimensions is used (Ginn 1991) with respect to any
pair of failure regions:





• The dimensions that appear in both boundaries in exactly the same way
are termed identically participating dimensions.

• Those that appear in both boundaries but not in an identical way are
termed coincidentally participating dimensions.

• The dimensions that appear in the boundary of neither region are
called nonbounding dimensions, because the boundaries these dimensions
place on the failure regions are no more restrictive than their entire
range of values.


In Ginn (1991) the identical and coincidental dimensions are collectively
referred to as composite dimensions.
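The classification of dimensions for a pair of failure regions can be sketched as below. Two assumptions are made for the illustration: each bound is represented as a mapping from a dimension name to the text of its predicate, and predicates are compared as plain strings. The label "single-region" for dimensions that bound only one of the two regions is also introduced here and is not part of the classification in Ginn (1991).

```python
def classify_dimensions(bound_i: dict, bound_j: dict, all_dims: list) -> dict:
    """Label every input dimension with respect to a pair of failure regions."""
    labels = {}
    for dim in all_dims:
        in_i, in_j = dim in bound_i, dim in bound_j
        if in_i and in_j:
            same = bound_i[dim] == bound_j[dim]
            labels[dim] = "identical" if same else "coincidental"
        elif not in_i and not in_j:
            labels[dim] = "nonbounding"
        else:
            labels[dim] = "single-region"   # hypothetical label, see lead-in
    return labels


# Hypothetical bounds of two failure regions over the dimensions x, y, w, u
bound_1 = {"x": "x<5", "y": "y<8"}
bound_2 = {"x": "x<5", "y": "y<3", "w": "w<0"}
labels = classify_dimensions(bound_1, bound_2, ["x", "y", "w", "u"])
print(labels)

composite = [d for d, kind in labels.items() if kind in ("identical", "coincidental")]
print(composite)
```

String comparison is the crudest possible test for "exactly the same way"; logically equivalent predicates written differently would be labeled coincidental by this sketch.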
III. CLUSTER ANALYSIS AS A PREDICTOR FOR FAULTS


A. CLUSTER ANALYSIS 

The identification of groups that have similar characteristics is the goal
of cluster analysis. In the context of this thesis, the objective of failure
region analysis is to investigate clustering of failure regions and identify the
logical relation of the program locations of the faults responsible for these
failure regions.

By the definition of a failure region (see, for example, Shimeall et al. 1991), the
execution of a program with data from some failure region as input will
reach and activate the corresponding program fault and propagate the resulting
error to the output.
The conjunction of predicates that defines the bound of the region (Chapter
II-D) contains a statement of the reachability, activation and propagation
conditions for the corresponding fault.

The criterion used by Ginn (1991) assumed that clustering between
failure regions occurs when identical or similar predicates appear in the
bounds of these failure regions. Therefore, one could use identical dimensions
(corresponding to identical predicates) or composite dimensions
(corresponding to both identical and similar predicates) to define a measure
of clustering. The assumption made by Ginn (1991) is reasonable:
when failure regions correspond to faults that share some
program paths (and therefore have some predicates in common), it is possible,
but not necessary, that the same input reveals all of them. Conversely,
when the faults do not share any path, it is impossible for both of
them to be revealed by the same input.

The criteria used for the failure region bounds variables are essentially
ordinal: identical, coincidental, nonbounding, in descending order. Therefore
relative values cannot be assigned to them. There are, however, two different
coefficients that may serve as a measure of clustering for ordinal data (see
Jain and Dubes 1988):


$$S(G_i,G_j) = \frac{a_{00} + a_{11}}{a} \qquad (3.1)$$

The simple matching coefficients for two failure regions $G_i$, $G_j$ are
defined by (3.1), where $a_{00}$ stands for the number of dimensions that are
non-bounding for both failure regions, $a_{11}$ for the number of shared dimensions
of $G_i$ and $G_j$ (cf. Chapter II-E), and $a$ for the total number of input dimensions.
The term shared may refer to dimensions appearing in identical predicates in the two
failure region definitions, or may include both similar and identical predicates; Ginn
(1991) investigates both cases separately. The similarity of failure
regions $G_i$ and $G_j$ is greater the closer the simple matching coefficient is to
unity; in particular $S(G_i,G_i)=1$, because $a_{01}=a_{10}=0$, where $a_{01}$ and $a_{10}$
count the dimensions that bound only one of the two regions. The inclusion of $a_{00}$ in the
numerator of the coefficient gives equal weight to bounding and non-bounding
dimensions and can result in very high (close to unity) values of the
coefficient when the total number of input dimensions is much greater than
the number of bounding dimensions of the two failure regions. Suppose for example the
pair of failure regions:

$$\mathrm{Bound}(G_1) = (x<5) \wedge (y<8) \wedge (z<10) \qquad (3.2)$$

$$\mathrm{Bound}(G_2) = (w<0) \wedge (u<0) \qquad (3.3)$$

If the total number of dimensions is 250, then $a_{00}=245$, $a_{11}=0$, and
$S(G_1,G_2)=245/250=0.98$, which implies a high degree of clustering, while it
is obvious that the failure regions are not related.
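The arithmetic of the example (3.2), (3.3) can be reproduced with a short sketch. It assumes that each failure region is summarized by the set of dimension names appearing in its bound and that a dimension counts as shared whenever it appears in both sets; the function name is chosen for this illustration.

```python
def simple_matching(dims_i: set, dims_j: set, total_dims: int) -> float:
    """Simple matching coefficient: (a00 + a11) / a."""
    a11 = len(dims_i & dims_j)                  # dimensions bounding both regions
    a00 = total_dims - len(dims_i | dims_j)     # dimensions bounding neither region
    return (a00 + a11) / total_dims


G1 = {"x", "y", "z"}     # dimensions in Bound(G1), equation (3.2)
G2 = {"w", "u"}          # dimensions in Bound(G2), equation (3.3)

print(simple_matching(G1, G2, 250))   # 0.98
print(simple_matching(G1, G1, 250))   # 1.0
```

The two regions share no dimension, yet the coefficient is close to unity because 245 of the 250 dimensions bound neither region.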
The Jaccard coefficients, defined by (3.4), seem more sensitive to
clustering since they place more emphasis on the composite dimensions, as opposed
to the simple matching coefficients, which are symmetric in composite and
non-coincidental dimensions. As with the simple matching coefficients, the
similarity of failure regions $G_i$ and $G_j$ is greater the closer the Jaccard
coefficient is to unity; in particular $J(G_i,G_i)=1$.

$$J(G_i,G_j) = \frac{a_{11}}{a - a_{00}} \qquad (3.4)$$

However, under certain circumstances, the reliability of Jaccard
coefficients is debatable. Suppose for example the pair of failure regions
defined by (3.5), (3.6):


$$\mathrm{Bound}(G_1) = (x<5) \wedge (y<8) \wedge (z<10) \qquad (3.5)$$

$$\mathrm{Bound}(G_2) = (x<5) \wedge (y<8) \wedge (z<10) \wedge (w<0) \wedge (u<0) \qquad (3.6)$$
In this case the Jaccard coefficient is $J(G_1,G_2)=3/(3+2+0)=0.6$, while
intuitively the relation is probably stronger than is suggested by
the coefficient. On the other hand, the simple matching coefficient,
depending upon the number of dimensions that are non-bounding for both $G_1$
and $G_2$, may vary between 0.6 and 0.9999.

In the following sections, we shall examine how pairs of faults may
result in clustered failure regions, as well as how the Jaccard coefficients are
affected.
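The comparison of the two coefficients for the pair (3.5), (3.6) can be reproduced as follows. As an assumption for this sketch, each region is summarized by the set of dimension names in its bound, and the total numbers of input dimensions (5 and 20000) are illustrative choices that produce the two ends of the quoted range.

```python
def jaccard(dims_i: set, dims_j: set) -> float:
    """Jaccard coefficient: a11 / (a - a00) = a11 / (a11 + a01 + a10)."""
    a11 = len(dims_i & dims_j)
    bounding = len(dims_i | dims_j)      # equals a - a00
    return a11 / bounding if bounding else 0.0


def simple_matching(dims_i: set, dims_j: set, total_dims: int) -> float:
    """Simple matching coefficient: (a00 + a11) / a."""
    a11 = len(dims_i & dims_j)
    a00 = total_dims - len(dims_i | dims_j)
    return (a00 + a11) / total_dims


G1 = {"x", "y", "z"}                    # equation (3.5)
G2 = {"x", "y", "z", "w", "u"}          # equation (3.6)

print(jaccard(G1, G2))                  # 0.6
print(simple_matching(G1, G2, 5))       # 0.6 when no dimension is non-bounding
print(simple_matching(G1, G2, 20000))   # 0.9999 with many non-bounding dimensions
```

The Jaccard value does not change with the size of the input domain, whereas the simple matching value is driven almost entirely by it.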


B. RELATED FAULTS 

The goal in analyzing fault relations is to understand the clustered
failure regions that appear in software experiments (Ginn 1991). To this
end, we need to determine what kind of fault relationship results in
clustered failure regions.


1. Logical Relation of Faults 
Faults can be logically related if they are either the same logical
flaw (taxonomically related) or they are located in regions of programs that
compute the same part of the application (functionally related).
2. Taxonomically Related Faults

A logical flaw is an error, or misunderstanding by the programmer,
in the design logic of a program that results in a number of faults. Taxonomically
related faults are expected to be both application dependent and
programmer dependent. They are application dependent because
applications requiring a great number of constructs of a certain type
(say, loops or if statements) are expected to be more prone to the
types of faults peculiar to these constructs (loop iteration and control flow
faults, respectively, in this example). It is not always clear from the
specification of a program exactly how many such constructs will be
required, and it is usually up to the programmer to decide the implementation
details. However, the higher-level design of a program always
gives an indication of whether the implementation requires many iterations or
a lot of case handling. Therefore faults of this type depend on both the
design and the implementation details.

On the other hand, they are programmer dependent because every
programmer has his or her own weak and strong points in developing a
software design, and some phases of his or her work will be
more prone to faults than others. This holds true for both the design
and the implementation phases.


3. Taxonomical Fault Classification of a Software Experiment

The known program faults in a set of eight
redundant versions of a combat simulation program (Shimeall 1991a, 1991b),
constructed as part of a software experiment (Shimeall and Leveson 1991),
were classified and analyzed. The fault categories used in this taxonomy are from Beizer
(1990). According to his classification scheme, each fault is characterized by
the type of logical flaw that produced it (for example, case selection bug or
control logic bug) and is assigned a four-digit code number.

The first digit is characteristic of the highest level of the taxonomy
hierarchy (for example, a Structural Bug has code 3xxx), while the last digit specifies
the exact category of the fault (for example, a Structural, control state fault
has code 3154). The advantage of this taxonomy is that it provides an easy
hierarchical and logical scheme, in which the four digits of the code specify the
four levels of the hierarchy used (cf. Chapter II, section C, for a discussion
of the highest levels).

The results of the fault classification in the eight versions are
presented in Table 3.1 (detailed results, including fault relative frequency
histograms for the eight versions and for the observed to expected relative
frequency ratio of the total number of faults, are presented in Appendix B,
Figures B.1 to B.8). In addition to the fault statistics, the table includes
the number of statements, if and case selection statements, and loop
constructs.


4. The "Taxonomical Clustering" Hypothesis 

The "taxonomical clustering" hypothesis is tested
by comparing the actual results with a simple random model
that assumes that, given the total number of faults in a program, the
probability of a fault occurring in any line is equal to the ratio of the total
number of faults to the total number of lines. This ratio is the expected fault
rate for the given program. The expected number of faults for each of the
categories in Table 3.1, under the random fault distribution, is calculated as the
product of the fault rate and the number of statements where this type of
fault can occur.

Within the scope of our working model, processing, initialization
and algorithmic faults (codes 321x, 322x, 323x) occur in all program
statements apart from if, case and loop constructs. Loop and iteration faults
(code 314x) occur in loop constructs. Control logic, case selection and
control state faults (codes 312x, 313x, 315x) may appear in if or case
statements. Exception handling faults usually appear either in control (if,
case) statements or in loop exit conditions. Therefore we further assume that,
on average, half of the faults in loop and control statements are
exception handling faults and the other half are loop and iteration faults or
control faults, respectively.
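The expected fault counts under the random model can be sketched as below. The function name, the category labels and the program sizes in the example are hypothetical; only the allocation rule (fault rate times the number of eligible statements, with loop and control statements split evenly between their own category and exception handling) follows the description above.

```python
def expected_faults(total_faults: int, n_statements: int,
                    n_control: int, n_loops: int) -> dict:
    """Expected number of faults per category under the uniform random model."""
    rate = total_faults / n_statements            # expected fault rate
    n_other = n_statements - n_control - n_loops  # all remaining statements
    return {
        "processing, initialization, algorithmic": rate * n_other,
        "loop and iteration": rate * n_loops / 2,
        "control logic, case selection, control state": rate * n_control / 2,
        "exception handling": rate * (n_loops + n_control) / 2,
    }


# Hypothetical program: 1000 statements, 100 if/case statements, 40 loops, 20 faults
model = expected_faults(total_faults=20, n_statements=1000, n_control=100, n_loops=40)
for category, count in model.items():
    print(f"{category}: {count:.1f}")
```

By construction the four expected counts add up to the total number of faults, so the model redistributes the observed total rather than predicting it.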

The random fault allocation that results from the use of our model, 
is also included in Table 3.1 for the eight program versions. In this case the 
"random" version number has the subscript r. 

In Appendix B we present the relative frequency distribution for
both the actual number of faults and the model predictions. To test the
goodness of fit of the observations to the random fault distribution model we
make use of the Chi-Square ($\chi^2$) One-Sample Test (Siegel and Castellan
1988). This test gives the level of significance (in fact, the probability of occurrence)
that the $\chi^2$ statistic, which increases with the difference between the
observed and the expected fault distribution, is greater than a certain value.
The higher the probability of the said difference, the more confident we are
that the selected model corresponds to the actual distribution. However, the
results of Table 3.1 imply that the probability of the observed differences
between the actual number of faults and the model prediction is below
0.001, because for 3 degrees of freedom (since there are four fault categories)
the smallest value of $\chi^2$ was 19.7, in version VI. Therefore we can
conclude that the fault distribution is not uniform. This implies that the faults
that have the most opportunity of being committed are not always the most
frequent in a program.
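The goodness-of-fit computation can be sketched as follows. The observed and expected counts are hypothetical and do not come from Table 3.1; the critical value 16.27 is the tabulated chi-square value for 3 degrees of freedom at the 0.001 level.

```python
def chi_square(observed, expected) -> float:
    """Chi-square one-sample statistic: sum of (O - E)^2 / E over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


CRITICAL_DF3_P001 = 16.27   # chi-square critical value, 3 d.o.f., alpha = 0.001

# Hypothetical counts for the four fault categories of one program version
observed = [6, 3, 9, 2]
expected = [17.2, 0.4, 1.0, 1.4]

stat = chi_square(observed, expected)
print(round(stat, 2), stat > CRITICAL_DF3_P001)

# The smallest statistic reported in the text also exceeds the critical value.
print(19.7 > CRITICAL_DF3_P001)
```

A statistic above the critical value means that differences this large would occur with probability below 0.001 if the faults were distributed according to the random model.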

It is evident from the statistics in Table 3.1 that control logic, case
selection and control state faults appear at a rate about fourteen times the
average fault rate in the program, while the processing and initialization fault
rate is about 30% of the average. This leads to the conclusion that the
handling of the control logic of the program, at least in the CONFLICT
experiment (Shimeall 1991a, 1991b, Shimeall and Leveson 1991) and for the
programmers selected, represents a task of much greater difficulty than
processing and initialization in the same program.
TABLE 3.1
(PROGRAM STATISTICS, FOR TAXONOMICAL CLUSTERING) 
5. Taxonomical Clustering of Failure Regions Compared to the
Structural Clustering of Failure Regions

In this section, we examine whether the logical clustering of failure
regions correlates with the number of shared dimensions in their bounds. The
number of shared dimensions in the bounds of two failure regions has been
used by Ginn (1991) in the definition of a clustering metric, the Jaccard
coefficient between two regions. On the other hand, we consider two failure
regions as members of the same taxonomical cluster if the associated faults
both result from similar logical flaws. We classify the taxonomical
clusters as follows:


• Type A: Corresponds to loop and iteration faults (code 314x).

• Type B: Corresponds to control logic, case selection and control state
faults (codes 312x, 313x, 315x).

• Type C: Corresponds to processing, initialization and algorithmic faults
(codes 321x, 322x, 323x).

• Type D: Corresponds to faults in exception handling (code 316x).
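The mapping from four-digit taxonomy codes to the cluster types listed above can be sketched as a lookup on the first three digits. The function and table names are hypothetical, and codes are represented as strings as an assumption of this illustration.

```python
from typing import Optional

# First three digits of the taxonomy code -> taxonomical cluster type
CLUSTER_BY_PREFIX = {
    "314": "A",                            # loop and iteration
    "312": "B", "313": "B", "315": "B",    # control logic, case selection, control state
    "321": "C", "322": "C", "323": "C",    # processing, initialization, algorithmic
    "316": "D",                            # exception handling
}


def cluster_type(code: str) -> Optional[str]:
    """Return the cluster type for a four-digit code, or None if unclassified."""
    return CLUSTER_BY_PREFIX.get(code[:3])


def same_taxonomical_cluster(code_1: str, code_2: str) -> bool:
    """Two faults belong to the same cluster when both map to the same type."""
    t1, t2 = cluster_type(code_1), cluster_type(code_2)
    return t1 is not None and t1 == t2


print(cluster_type("3154"))                        # control state fault
print(same_taxonomical_cluster("3121", "3154"))    # both of type B
print(same_taxonomical_cluster("3141", "3211"))    # types A and C
```

Codes outside the listed prefixes are left unclassified rather than forced into a cluster.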


Summary statistics of this comparison are presented in Table 3.2
(details are included in Appendix D). From these results it is apparent that
the appearance of shared dimensions, and the variation of the Jaccard
coefficient, on failure region bounds do not depend on the taxonomical
clustering as analyzed in subsection b, where the clustering criterion between
different failure regions has been whether they correspond to faults resulting
from the same logical flaw.


TABLE 3.2 
AVERAGE AND STANDARD DEVIATION OF THE JACCARD 
COEFFICIENT BETWEEN TAXONOMICALLY CORRELATED 
FAILURE REGIONS 


VERSION | Failure region pair of type A | Failure region pair of type B | Failure region pair of type C | Failure region pair of type D | Uncorrelated failure region pair
0.051+0.115 0.049+0.134 | 0.042+0.108 | 042+0.108 


. oar 0.076+0.135 | 0.054+0.107 Oe pees 0.055+0.115 
In Table 3.3 we present the same statistics as in Table 3.2, with
the difference that, in this case, we use the average non-zero Jaccard
coefficient between taxonomically correlated failure regions. This indicates
whether the taxonomical clustering of failure regions correlates with the
number of shared dimensions in their bounds, when these shared dimensions
exist.

In table entries without a standard deviation there is only one value.


TABLE 3.3 
AVERAGE AND STANDARD DEVIATION OF THE NON ZERO 
JACCARD COEFFICIENT BETWEEN TAXONOMICALLY 
CORRELATED FAILURE REGIONS 


VERSION | Failure region pair of type A | Failure region pair of type B | Failure region pair of type C | Failure region pair of type D | Uncorrelated failure region pair
0.  0.12440.150_ 150 0. | 0.24740.361 361 0. Mey 0.164+0. / 0,16440.159_ 
eS ee 
_m | = | ormisorss | orssiorse | | ossio.as 
JW | - | orsiooas | orssiore7 | 0077 | 013180136 _ 
vf fessor | on [= | onseoaes 
Te 
P vu [| oxsnora7 | o1sse0 109 | oa | orrssom0” 
0.469±0.222 0.283±0.117 0.387±0.258


















The results in Table 3.3 provide us with evidence in favor of the
intuitively obvious hypothesis that the structural clustering of failure regions
does not depend on the taxonomical clustering. This result can be attributed
to the fact that structural clustering of failure regions depends strongly on
the control and data flow structure of the program with respect to the
corresponding fault location. On the other hand, the taxonomical clustering
dependence is restricted to the type of statement at the fault location.


6. The Case of Functionally Related Faults 
The hypothesis that logically related faults located in regions of
programs that compute the same part of the application may lead to some
type of clustering is based on the intuitive assumption that some
parts of a problem may be more difficult to handle, or more "error prone",
than others (Brilliant et al. 1990). We shall call faults of this type
functionally related.


7. Functional Fault Classification of a Software Experiment
The distribution of the known program faults in the eight versions
of the combat simulation program (Shimeall 1991a, 1991b) over the program
modules implementing different functional requirements of the specification
was analyzed. We classified the functional clusters according to the
CONFLICT Specification (Shimeall 1991b) as follows:


• Type I. Positioning and Movement

• Type II. Observation

• Type III. Attrition

• Type IV. Communication

• Type V. Environment

• Type VI. Restoration

• Type O. Others (includes the main procedure of CONFLICT and some
procedures for initialization and output format)


8. The Functional Clustering Hypothesis 

The null hypothesis that fault distribution is uniform over the
program locations was used in this analysis to test the clustering hypothesis.
The number of expected faults for each functional requirement of the
specification was set equal to the total number of lines of the routines
implementing the requirement times the fault rate (as in subsection 1b) for the
program.

The results of this analysis are presented in Table 3.4, while the
detailed analysis is included in Appendix D.

To test the goodness of fit of the observations to the random fault
distribution model we make use of the Chi-Square (χ²) One-Sample Test
(Siegel and Castellan 1988), as in subsection 4. In this case there are seven
categories of faults, therefore the degrees of freedom are six.
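
The procedure described above can be sketched in a few lines of Python. This is only an illustration of the uniform faults-per-line model and of the χ² statistic; the line counts and fault counts below are hypothetical values chosen for the example and are not taken from Table 3.4 or Appendix D, and the function names are our own.

```python
def expected_faults(lines_per_group, total_faults):
    """Expected faults per group when faults are uniform per line of code."""
    total_lines = sum(lines_per_group.values())
    fault_rate = total_faults / total_lines  # faults per line for the program
    return {group: n * fault_rate for group, n in lines_per_group.items()}


def chi_square(observed, expected):
    """One-sample chi-square statistic: sum over groups of (O - E)^2 / E."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


# Hypothetical line counts and observed faults for the seven functional types.
lines = {"O": 300, "I": 200, "II": 250, "III": 400, "IV": 500, "V": 150, "VI": 200}
observed = {"O": 2, "I": 1, "II": 4, "III": 6, "IV": 5, "V": 1, "VI": 1}

expected = expected_faults(lines, sum(observed.values()))
statistic = chi_square(observed, expected)
degrees_of_freedom = len(lines) - 1  # seven categories give six degrees of freedom

print(expected)
print(round(statistic, 3), degrees_of_freedom)
```

The statistic is then compared against a χ² table with the stated degrees of freedom. Note that several of the hypothetical expected counts are below five, which is exactly the situation in which categories should be grouped before the test is trusted.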

Apart from versions IV, VII and VIII, the confidence level for the
null hypothesis varies from 1% to 31%. In versions IV, VII and VIII this
confidence drops below 0.1%. This result, however, may be attributed to the
low (less than five) expected number of type I, II, III, V, VI and O faults for
version IV and the low expected number of type I, V and VI faults for version
VIII. In version VII the result may be attributed to the low number, 3.1, of
expected faults in column IV compared to the 10.5 observed.

The above discussion can be checked by considering only two
fault categories for version IV: Type IV, with 7 observed and 6.7 expected
faults, and all others, with 16 observed and 16.3 expected faults. This
results in a χ² equal to 0.019 and a corresponding confidence level, with one
degree of freedom this time, of nearly 90%.
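
The two-category statistic for version IV can be reproduced directly from the observed and expected counts quoted above. The sketch below uses only the Python standard library; the closed form used for the upper-tail probability holds for one degree of freedom only, and the variable names are our own.

```python
import math

# Counts quoted in the text for version IV: Type IV versus all other types.
observed = [7.0, 16.0]
expected = [6.7, 16.3]

# Chi-square statistic: sum of (O - E)^2 / E over the two categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For one degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(round(chi2, 3))  # 0.019
print(p_value)
```

Both deviations are only 0.3 faults, so the statistic is very small and the two-category data give no grounds for rejecting the uniform model.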

In the case of version VIII, we may consider the fault categories
II, III, IV, O and all others, with four degrees of freedom, and the χ² will be
equal to 14.85, which gives a confidence level below 1%.

The confidence level increases similarly for all the remaining
versions if we group together all categories with an expected value less than
5, as suggested by Siegel and Castellan (1988). Under the circumstances, we
can neither accept nor reject the null hypothesis of uniform distribution of
faults.

An alternate hypothesis, that the faults are uniformly distributed
among type IV (Communication), types II+III (Observation and Attrition)
and all other types, is tested in Table 3.5. The alternate hypothesis assumes
that some type of functional clustering exists, because 2/3 of the faults are
clustered in the Communication, Attrition and Observation modules, which,
on the average, constitute 50% of the total lines of code.
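
As an illustration of how the alternate hypothesis is tested, the sketch below spreads the faults of one version evenly over the three groups and computes the χ² statistic with two degrees of freedom. The observed counts are hypothetical and are not taken from Table 3.5; the group labels and function names are our own. For two degrees of freedom the upper-tail probability has the simple closed form exp(-χ²/2).

```python
import math


def uniform_chi_square(observed):
    """Chi-square statistic against an equal split of faults over all groups."""
    total = sum(observed.values())
    expected = total / len(observed)  # same expected count in every group
    statistic = sum((o - expected) ** 2 / expected for o in observed.values())
    return expected, statistic


def upper_tail_two_dof(statistic):
    """P(X >= statistic) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


# Hypothetical fault counts for one version (41 faults in total).
observed = {"IV": 15, "II+III": 14, "other": 12}

expected, statistic = uniform_chi_square(observed)
confidence = upper_tail_two_dof(statistic)

print(round(expected, 2), round(statistic, 3), round(confidence, 3))
```

A version whose faults concentrate more strongly in one group than an even three-way split predicts produces a large statistic and a low confidence level, which is how a clustering tendency stronger than the one assumed shows up in this test.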


TABLE 3.4 
(PROGRAM STATISTICS, FOR FUNCTIONAL CLUSTERING) 
TABLE 3.5 (ALTERNATE HYPOTHESIS TEST)




i Version Type IV type IV | Type nett | Type | 7 H+HI Type TOTAL | Confidence ever’ 
| — =o 

| v0 

iv = = = 


13.67 13.67 | 13.67 


oc 
Tome f «| = |= [ww] - 
Grom | es [es | es [ew] 
es 


The confidence levels of the chi-square test range from 84.3% to
0.6%. Although we cannot accept or reject the alternate hypothesis, we
notice that the confidence levels for acceptance are in general higher than
those of the null hypothesis. The results for versions III, VI, VII and VIII are
in favor of the alternate hypothesis, which implies some type of functional
clustering. The results for versions II, IV and V are not in favor of the
alternate hypothesis because the functional clustering tendency is stronger
than that assumed by the alternate hypothesis, and only version I implies that
there is no clustering tendency favoring Attrition, Communication and
Observation.


Therefore it is reasonable to conclude from the data at hand that there is
indeed some, not very strong, tendency of the failure regions to cluster in
some groups of functional program modules more than in others.


9. Clustering of Failure Regions of Functionally Related Faults Compared
to the Structural Clustering of Failure Regions

In this section, we examine whether the functional clustering of
failure regions correlates with the number of shared dimensions in their
bounds. In this case we consider two failure regions as members of the
same functional cluster if the associated faults appear in regions of the
program that compute the same part of the application. Summary statistics
of this comparison are presented in Tables 3.6 and 3.7 (the latter corresponds
to the non zero Jaccard coefficient case), while details are included in
Appendix D.

From the results of Table 3.6 it is not possible to conclude whether the
appearance of shared dimensions, and the variation of the Jaccard coefficient,
on failure region bounds depends on the functional
clustering, as defined in subsection 8, because the calculated standard
deviations exceed the averages due to the abundance of zero data values.

However, from Table 3.7, we can see that the average non zero
Jaccard coefficient is systematically higher for functionally correlated pairs
than for uncorrelated ones. This implies that, given the structural clustering
of two failure regions, the clustering metric is higher for functionally
correlated ones. This result can be attributed to the fact that structural
clustering of failure regions depends on the control and data flow structure
of the program, with respect to the corresponding fault location.

It is expected that in a reasonably well structured program, faults
on the same group of functional modules will share a common control and
data flow path segment more often than faults on different groups.
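
The effect of the many zero values on the summary statistics can be seen in a small sketch. The coefficients below are hypothetical and do not come from Tables 3.6 and 3.7; the use of the sample standard deviation is also an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(coefficients):
    """Return (mean, sample standard deviation) of a list of coefficients."""
    return mean(coefficients), stdev(coefficients)


# Hypothetical Jaccard coefficients for pairs of failure regions in one cell
# of the table: most pairs share no bounding dimension at all.
coefficients = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.4]
nonzero = [c for c in coefficients if c > 0]

all_mean, all_std = summarize(coefficients)
nz_mean, nz_std = summarize(nonzero)

print(round(all_mean, 3), round(all_std, 3))  # standard deviation exceeds mean
print(round(nz_mean, 3), round(nz_std, 3))    # mean exceeds standard deviation
```

With the zeros included, the spread is dominated by the contrast between zero and nonzero pairs, so the average says little. Restricting the statistics to nonzero coefficients, as in Table 3.7, answers a narrower question: given that two regions are structurally related at all, how strong is the relation.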
TABLE 3.6
AVERAGE AND STANDARD DEVIATION OF THE JACCARD
COEFFICIENT BETWEEN FUNCTIONALLY CORRELATED
FAILURE REGIONS⁶


VERSION 


0.025 | 0.140 | 0.042 | 0.240 | 0.125 | 0.096 | 0.0284 | 
+0.04 | +0.006 +0.49 | +0.21 | +0.14 | 40081 
0.091 0.076 | 0.111 0.0181 
+0.24 +0.184 | +0.192 +0.073 
0250 | o+0 | 0.223 | 0.097 | 0.148 0.096 | 0.046 
+0.312 | +0.137 | +0.091 +0.14 | +0.095 





IV 0.0374 | 0.056 | 0.0823 0.019 : 
+0.064 | +0.096 | +0.235 +0.053 | 
Vv 0+0 | 0.714 | 0.0227 0.005 | 0.0087 : 
+0.086 +0.03 | +0.051 | 
0336 | 0114) |ou07s 0.0643 | 0.0184 : 

+0.043 | +0.220 | +0.127 +0.16 | +0.071 


0.117 | O10ie2 
+0.19 | +0.071 


+0. +0. +0. 
Vill 0.125 0.333 O75 0.102 0.187 0.069 
+0125 +0.315 +0.206 | +0.285 +0.180 | 





6 In entries without Standard Deviation there is only one value.
TABLE 3.7
AVERAGE AND STANDARD DEVIATION OF THE NON ZERO
JACCARD COEFFICIENT BETWEEN FUNCTIONALLY
CORRELATED FAILURE REGIONS⁷


0350+ | 0.430 
02 + ate 
0.110 
0.636 0.4074 | 0.333 0.206+ 
0.216 0.152 
0.250 86+ 244+ 148+ 0.240 | 0.145+ 
; als 0.116 
0.149 


IV 0.127+ |} 0.167 0.70 Jey 
0.053 0.080 
Vv 248+ 0.170 
16 zi 
0.029 
0.336+ | 0.1904 | 0.186+ 0.50 0.100+ 
0.043 0.265 0.156 0.070 
0.2922 | 0.4764 | 0.333 0. oe 0.196+ 
+0.239 | 0.088 0.130 
_9.0 096. 





7 In entries without Standard Deviation there is only one value. Crossed-out
entries contain the average and standard deviation of only two data values.
C. CONCLUSIONS ON CHAPTER III

In this chapter the failure region analysis on the results of the 
CONFLICT experiment identifies the logical relation of the faults responsible 
for the observed failure regions. 

The testing of the taxonomical clustering hypothesis, subsections 3.3 to 3.5,
implies that the observed fault distribution is not uniform. The control logic,
case selection, and control state faults appear at a rate about fourteen times
the average fault rate in the program, while the processing and initialization
fault rate is about 30% of the average. Therefore control logic faults and
corresponding failure regions actually exhibit taxonomical clustering
behavior. This, however, does not correlate with the structural clustering of
failure regions observed by Ginn (1991) on the same set of data. This can
be justified by the fact that the latter depends strongly on the control and
data flow structure of the program, with respect to the corresponding fault
location, while the former depends on the type of statement at the fault
location.

The results of subsections 3.6 to 3.8 show that about 2/3 of the faults,
on the average, appear in the functional modules of Attrition, Communication,
and Observation, which include approximately 50% of the program
lines. This implies some mild tendency of the failure regions for functional
clustering, since some functional modules are indeed more fault-prone than
others.

From subsection 3.9 we can see, comparing our results with Ginn's
(1991), that for all pairs of structurally correlated failure regions, the
clustering metric is, on the average, higher for functionally correlated ones.
This is justified by the fact that, in a reasonably well structured program,
faults on the same group of functional modules will share a common control and
data flow path segment more often than faults on different groups; therefore
functional and structural correlation cannot be independent.


IV. CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH

It has been conjectured in the past that fault occurrences tend to
converge on program locations, which implies that the revealing of a fault
might indicate the existence of others in close proximity.
However, the evidence so far has been mostly anecdotal. This thesis, together
with Ginn's (1991), is among the first to analyze the relationships between
specific faults using structural (Ginn 1991), taxonomical and functional (this
thesis) criteria. The results of both support the hypothesis of fault clustering
and suggest methods for exploiting it in software testing. This
chapter summarizes these results, in conjunction with previous work, and
points towards the research questions which are open to further investigation.


A. CONCLUSIONS 
This thesis, being a sequel to previous work by Ginn (1991), offers
strong evidence that failure regions tend to form clusters, not only when
structural criteria are used, but when taxonomical and functional criteria
are used as well.
The clustering criteria used in this thesis were imposed externally, since
both the fault taxonomy and the functional classification of the faults have
been independent of the experimental data structure. Therefore, the
observed clustering tendency of failure regions can be characterized as global
as opposed to the local clustering tendency explored by Ginn (1991). In local
clustering the criteria are strongly dependent upon the structure of the data
at hand (Jain and Dubes 1988). Therefore the Jaccard coefficients used by
Ginn readily fall into this category.

The taxonomical clustering behavior suggests that parts of the program
prone to control logic, case selection and control state faults must be the
focus of the testing effort. This implies that these parts of a program must
also be thoroughly and extensively documented in order to facilitate this
focus of effort. The CONFLICT experiment (Shimeall 1991a, b) data
analysis suggests that decision points and program control flow have a higher
probability of fault occurrence than other locations of the code. A good
testing or documentation method, such as decision tables, is expected to
reduce substantially the rate of faults of this type.

The usefulness of the functional clustering behavior is that known faults
in a program imply that the probability of more, undiscovered, faults in the
same functional module is higher than in the rest of the code. This provides
experimental evidence in support of the anecdotal conjecture that faults tend to
attract other faults (Myers 1979). It also suggests that the distribution of
faults during the first tests indicates the most fault-prone functional modules,
which should be singled out for additional testing.

More often than not the testing effort is exponential, or of higher
complexity, in the length of code. The ability to point out the most
fault-prone modules or constructs, under the experimentally verified assumption
of functional and taxonomical clustering, represents a substantial reduction
in the required amount of testing.

The nature of the cluster formation, and the correlation to Ginn's
structural clustering, was markedly different for the two criteria used. The
taxonomical classification tended to demonstrate a clustering of type C faults
(and failure regions) in numbers one order of magnitude higher than
expected when a uniform fault per line of code distribution was assumed.

In order to compare the taxonomical clustering of failure regions to the
structural clustering of the same regions, we calculated both the average
Jaccard coefficient and the average non-zero Jaccard coefficient (using Ginn's
results) for every type of possible taxonomical clustering and every version
of the program. The results in Tables 3.2 and 3.3, in Chapter III, suggest that
the structural clustering of failure regions does not depend on taxonomical
clustering. This is attributed to the fact that structural clustering depends
strongly on the control and data flow structure of a program with respect to the
corresponding fault locations, while the taxonomical clustering depends mainly on
the type of program statement at the said locations.

The functional criterion revealed a tendency of faults to concentrate on
certain functional groups of modules (in the CONFLICT case, in the Communication,
Observation and Attrition groups). A quite interesting result in this analysis
has been the small-scale clustering exhibited by the faults, which tended to occur
in high numbers within certain procedures (93) while the majority of the examined
procedures (446 total, in all eight versions of CONFLICT) were faultless (cf.
Appendix D for a detailed functional fault type distribution). Similar behavior
of faults has been reported by Myers (1979) but no explanation was cited. The
small-scale clustering suggests that some procedures are more complicated,
and therefore more fault-prone, than others. In the eight versions of the CONFLICT
experiment, the average procedure was 30±20 lines of code in length, while the
average length of procedures with at least one fault was 42.5±33 lines and
with two or more faults 55±45 lines. This result provides evidence in support of
the above-mentioned argument about procedure complexity and fault
clustering correlation, despite the inaccuracies introduced by the fact that
only faults with a well-defined location were considered.

Unlike the taxonomical-to-structural clustering lack of correlation, in
testing the functional-to-structural clustering correlation in the same way
as above, it was found that the average non-zero Jaccard coefficient is
systematically higher for functionally correlated pairs than for uncorrelated
ones. This mild correlation is explained by the fact that in a reasonably
well-structured program, faults on the same group of functional modules will
share a common control and data flow path, that is, will be structurally
correlated, more often than faults on different groups.

The use of the number of lines of code (excluding comments but not
variable declarations) in our analysis instead of the Halstead length is due
to the ease of use of this metric as well as to unpublished calculations.

B. COMPARISON OF RESULTS TO PREVIOUS WORK

The results of this thesis generally support the findings of previous 
researchers in the area of relationships between faults. This agreement 
suggests that the relationships may not be specific to the CONFLICT 
software experiment but have a more general validity in large software 
applications. 

Myers (1979) postulates that functional clustering exists but provides
no further evidence or explanation of the basis for it. This thesis, together
with the companion work of Ginn (1991), takes a step toward identifying
the specific behavior of faults that results in failure region clustering.

The prominent role of control logic, case selection and control state (type
C) faults in program testing has always been emphasized (Beiser 1990,
Myers 1979); therefore the importance of taxonomical clustering cannot be
easily overlooked.


Brilliant et al. (1990), analyzing the faults in a 27-version software
experiment, conclude that the faults across independent versions not only are
not independent but the interdependence is, in many cases, more pronounced
among logically related (which includes taxonomically and functionally
related) faults.

Further study of the fault relations within the same version or across
versions is required to establish the interaction mechanisms.


C. SUGGESTIONS FOR FURTHER RESEARCH 

While the results of this work are promising, the experimental
population was small and narrowly focused. Additionally, the programs were
written by students. Both the method and the results should be validated
using a broad range of professionally produced applications.

One weakness of the method used in this thesis is that in testing the 
taxonomical clustering hypothesis, a small number of faults (about 10%) 
cannot be classified and therefore they are not included in the analysis. This 
is expected to introduce some minor inaccuracies in the results. 

Another weakness is the use of seven functional groups of modules in
the functional clustering analysis: Movement, Observation, Attrition,
Communication, Environment, Restoration and Others. This coarse functional
decomposition, imposed by the necessity of a common base for comparison
for all eight versions of CONFLICT, blurs the small-scale clustering. In
addition, the number of lines per module is quite a weak metric for
procedure complexity and cannot be used effectively as an oracle to single
out the most fault-prone modules. However, this simple metric indicates that
it is possible to use some normal metric as a predictor of the complexity of
a program unit. Further research is required to develop metrics that can serve
as oracles for fault-prone modules.

Finally, the Jaccard coefficient, based on the number of shared
dimensions bounding two failure regions, is not a very efficient measure of
structural clustering. A better way to establish a metric of two-fault
correlation might be the use of the number of paths through both their
respective locations. This method was not used in this thesis since we have
been focused on taxonomical and functional rather than structural clustering.
However, once certain ambiguities, such as distributed faults or faults
without a specific location (a missing function, for example), are resolved, the
proposed metric will directly translate the relationship between two failure
regions into two sets of code locations.
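
One possible reading of the proposed metric is sketched below for the simplest case, an acyclic control flow graph with two distinct, well-defined fault locations. The graph, the node names and the function names are hypothetical; loops, distributed faults and faults without a location are not handled, and nothing of this kind was implemented in this thesis.

```python
from functools import lru_cache


def count_paths(graph, src, dst):
    """Number of distinct paths from src to dst in an acyclic graph."""
    @lru_cache(maxsize=None)
    def walk(node):
        if node == dst:
            return 1
        return sum(walk(nxt) for nxt in graph.get(node, ()))
    return walk(src)


def paths_through_both(graph, entry, exit_node, u, v):
    """Entry-to-exit paths visiting both u and v (u and v must be distinct)."""
    via_uv = (count_paths(graph, entry, u) * count_paths(graph, u, v)
              * count_paths(graph, v, exit_node))
    via_vu = (count_paths(graph, entry, v) * count_paths(graph, v, u)
              * count_paths(graph, u, exit_node))
    # In an acyclic graph at most one of the two orders is possible.
    return via_uv + via_vu


# Hypothetical control flow graph: two decisions in sequence.
cfg = {
    "entry": ["a", "b"],
    "a": ["c"], "b": ["c"],
    "c": ["d", "e"],
    "d": ["exit"], "e": ["exit"],
}

print(count_paths(cfg, "entry", "exit"))                 # 4
print(paths_through_both(cfg, "entry", "exit", "c", "d"))  # 2
print(paths_through_both(cfg, "entry", "exit", "a", "b"))  # 0
```

Locations on opposite branches of the same decision share no path and score zero, while locations on a common path segment score higher, which is the kind of relationship the shared-dimension count only approximates.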
APPENDIX A: NOTATION AND SYMBOLS

$D_j$: Partition of the input domain by path $j$.

$P(j)$: Boolean expression equal to the conjunction of all branch
predicate constraints along path $j$. Therefore all elements of $D_j$ satisfy
$P(j)$.

Bound(G): The boolean expression which defines failure region G.

⊻: Symbol for exclusive or of boolean expressions.

$S_{j1}S_{j2} \ldots S_{jn}(D')$: The mapping of a subset, $D'$, of the input
domain to the output range, by a sequence, $S_{j1}S_{j2} \ldots S_{jn}$, of
program statements (this sequence is called the path product), where $j$ is a
path. Condition evaluations are retained in the path products for consistency.
In fact, the path products of the two segments starting from a decision point
(such as an if C then else construct or a loop exit condition) are preceded
with the condition C and condition ¬C respectively, depending upon
which part of the decision (if or else) results in their execution.


$S_{j1}S_{j2}(S_{ji}S_{ji+1} \ldots S_{ji+m})^{*} \ldots S_{jn}$: A sequence
of program statements, when a subset of them,
$S_{ji}S_{ji+1} \ldots S_{ji+m}$, is the body of a loop.

$S_{j1}S_{j2}(S_{ji}S_{ji+1} \ldots S_{ji+m})^{+} \ldots S_{jn}$: A sequence
of program statements, when a subset of them,
$S_{ji}S_{ji+1} \ldots S_{ji+m}$, is the body of a loop which executes at
least once.


$S(G_i, G_j)$: The simple matching coefficient for failure regions $G_i$ and
$G_j$:

$$S(G_i, G_j) = \frac{a_{00} + a_{11}}{a}$$

where $a_{00}$ stands for the number of non-bounding dimensions of both
failure regions, $a_{11}$ for the number of composite dimensions of $G_i$ and
$G_j$, and $a$ for the number of all input dimensions.

$J(G_i, G_j)$: The Jaccard coefficient for failure regions $G_i$ and $G_j$:

$$J(G_i, G_j) = \frac{a_{11}}{a - a_{00}}$$

where $a_{00}$ stands for the number of non-bounding dimensions of both
failure regions, $a_{11}$ for the number of composite dimensions of $G_i$ and
$G_j$, and $a$ for the number of all input dimensions.
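
As a small illustration, both coefficients can be computed from the sets of input dimensions that bound each failure region. The sketch treats every dimension appearing in both boundaries as composite and every dimension appearing in neither as non-bounding; the dimension names and the function name are hypothetical.

```python
def matching_coefficients(bounds_i, bounds_j, all_dims):
    """Return (simple matching, Jaccard) for two sets of bounding dimensions."""
    a = len(all_dims)                              # all input dimensions
    a11 = len(bounds_i & bounds_j)                 # composite dimensions
    a00 = len(all_dims - (bounds_i | bounds_j))    # non-bounding dimensions
    simple = (a00 + a11) / a
    # If no dimension bounds either region the Jaccard ratio is undefined;
    # it is reported as 0.0 here by convention.
    jaccard = a11 / (a - a00) if a > a00 else 0.0
    return simple, jaccard


# Hypothetical example: ten input dimensions, two failure regions.
all_dims = {f"x{k}" for k in range(1, 11)}
g_i = {"x1", "x2", "x3"}
g_j = {"x2", "x3", "x4", "x5"}

simple, jaccard = matching_coefficients(g_i, g_j, all_dims)
print(simple, jaccard)  # 0.7 0.4
```

The simple matching coefficient rewards the five dimensions that bound neither region, whereas the Jaccard coefficient ignores them and considers only the five dimensions that bound at least one region.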


A → B or, equivalently, B ← A: Logical implication, A implies B (or
¬A ∨ B).


Identically participating dimensions: The dimensions that appear in 
both boundaries of a pair of failure regions in exactly the same way. 


Coincidentally Participating Dimensions: The dimensions that appear 
in both boundaries of a pair of failure regions but not in an identical 
way. 


Nonbounding Dimensions: The dimensions that, for a pair of failure
regions, do not appear in the boundaries of either region. The boundaries
these dimensions place on these failure regions are no more
restrictive than their entire range of values.


Composite Dimensions: Collective name for both Identical and
Coincidental dimensions.


• Chi-Square Statistic (χ²): Statistic used to test whether a significant
difference exists between an observed and an expected number of
objects⁸ (the expected number of objects results from an assumed
distribution of objects into categories). The greater the chi-square
statistic, the lower the confidence that the sample data follow the
assumed distribution. If there are $k$ categories of objects, $O_j$ is the
number of observed and $E_j$ the number of expected objects in category
$j$, then the statistic has $k-1$ degrees of freedom and is equal to:

$$\chi^2 = \sum_{j=1}^{k} \frac{(O_j - E_j)^2}{E_j}$$


8 Siegel and Castellan 1988
APPENDIX B: TAXONOMICAL FAULT TYPE DISTRIBUTION

In this Appendix we present histograms (Figures B.1 to B.8, in light
grey) of the relative frequency of occurrence of fault types in the eight
versions of CONFLICT (Shimeall 1991). The fault types in the histograms
are LOOP, CONTROL, PROCESS and EXCEPTION HANDLING. They
correspond to the faults of type A, B, C and D of Chapter III (in fact A, B,
C, D are just abbreviations).

The fault type frequency histograms are compared with the expected
frequency histograms of a simple model which assumes uniform distribution
of faults (histograms in dark grey). In Figure B.9 we present an overall
comparison of the observed to expected number of faults ratio, for all eight
versions. There is a very sharp peak of the ratio distributions in the
CONTROL type of faults, which implies that this type of fault has a
frequency of occurrence much greater than that expected by the assumed
model.
Figure B.1: Taxonomical Fault Frequency in Version I (light grey) versus
random distribution of faults to program locations (dark grey).





Figure B.2: Taxonomical Fault Frequency in Version II (light grey) 
versus random distribution of faults to program locations (dark grey). 





Figure B.3: Taxonomical Fault Frequency in Version III (light grey) 
versus random distribution of faults to program locations (dark grey). 





Figure B.4: Taxonomical Fault Frequency in Version IV (light grey) 
versus random distribution of faults to program locations (dark grey). 





Figure B.5: Taxonomical Fault Frequency in Version V (light grey) 
versus random distribution of faults to program locations (dark grey). 
Figure B.6: Taxonomical Fault Frequency in Version VI (light grey)
versus random distribution of faults to program locations (dark grey).





Figure B.7: Taxonomical Fault Frequency in Version VII (light grey) 
versus random distribution of faults to program locations (dark grey). 





Figure B.8: Taxonomical Fault Frequency in Version VIII (light grey) 
versus random distribution of faults to program locations (dark grey). 





Figure B.9: Actual number of taxonomical faults to expected number of 
faults ratio for the eight versions of the program. 
APPENDIX C: SHARED BOUNDING DIMENSIONS IN TAXONOMICALLY
CLUSTERED FAILURE REGIONS

In this Appendix we present a detailed analysis of whether taxonomical
clustering of failure regions results in an increased number of shared⁹
bounding dimensions.

The results are presented in Tables C.1 to C.8. Each of the tables
corresponds to one of the eight versions of the CONFLICT program
(Shimeall 1991, 1991b). Each table entry corresponds to a pair of failure
regions of the same version. It is noted that the numbering of the failure
regions is not significant in this analysis. It merely represents the order in
which the faults and the corresponding failure regions were discovered.

Table entries on the main diagonal contain the logical cluster identifier
(A to D) for the corresponding failure region (cf. Chapter III). Each of the
off-diagonal entries contains the Jaccard coefficient for the identical
dimensions of the two failure regions labeling the row and column, and the
fault type identifier, in case the regions belong to the same logical cluster.

For example, entry (1.2,1.5) contains 0.100\B, which means that the
Jaccard coefficient for these failure regions is equal to 0.100, and that both
correspond to faults of type B. On the other hand, entry (1.12,1.5) contains
0.154, so the two failure regions have a Jaccard coefficient equal to 0.154,
but correspond to different types of faults (D and B respectively, in this
case).


9 In this analysis, shared dimensions are the ones that correspond to the same
predicates, i.e., identical dimensions.
TABLE C.1 :
JACCARD COEFFICIENTS OF FAILURE REGIONS OF VERSION I
Sa gaan 0.067 0.048 
pos | to 080 0.00N\C 0.050 







0.0SAC 0.421 




0.048\C 




TABLE C.2 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF
VERSION II


p21 | 0 | oo | oor | 0083 | 0063 | D> 
p21 | oom | om | om | o | o | o 
= Se 
APPENDIX D: FUNCTIONAL FAULT TYPE DISTRIBUTION

In this Appendix we present (Tables D.1 to D.8) the relative frequency
of occurrence of functional fault types in the eight versions of CONFLICT
(Shimeall 1991b). The fault types are classified in types O, I, II, III, IV, V
and VI according to the discussion in Section B.7 of Chapter III.

Some of the faults included in the failure region library by Shimeall
(1991a) do not have a specific location and, therefore, cannot be assigned to
any specific program module. However, it is possible by their description to
determine which part of the CONFLICT function each of them affects. These
faults occupy the entries (labeled as Others, Movement, etc.) in bold
type that precede each functional group of program modules.

Faults that have been assigned to multiple functional groups contribute
to the fault count of each group by an appropriate fraction of a fault. For
example, a fault distributed over three groups will contribute 1/3 to the
fault count of each group.
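
The fractional counting rule can be written down in a few lines. The fault list below is hypothetical and serves only to show how non-integer observed counts arise; exact fractions are used so that the group counts always add up to the number of faults.

```python
from collections import defaultdict
from fractions import Fraction


def fault_counts(faults):
    """Split each fault equally over the functional groups it is assigned to."""
    counts = defaultdict(Fraction)  # missing groups start at 0
    for groups in faults:
        share = Fraction(1, len(groups))
        for group in groups:
            counts[group] += share
    return dict(counts)


# Hypothetical faults: one in a single group, one in three, one in two.
faults = [["IV"], ["II", "III", "IV"], ["III", "IV"]]

counts = fault_counts(faults)
print({group: str(value) for group, value in counts.items()})
print(sum(counts.values()))  # 3, the total number of faults
```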

The observed fault type frequencies are compared with the expected
frequencies of a simple model that assumes uniform distribution of faults, as
in Appendix B.

10 The fault rate is 26/2352 = 0.01105 faults per line (including data
declarations in the line count)

11 The fault rate is 26/1540 = 0.0169 faults per line (including data
declarations in the line count)

12 The fault rate is 40/1200 = 0.0333 faults per line (including data
declarations in the line count)

13 The fault rate is 23/1995 = 0.01153 faults per line (including data
declarations in the line count)

14 The fault rate is 40/1544 = 0.026 faults per line (including data
declarations in the line count)
TABLE D.5 FAULT DISTRIBUTION TO FUNCTIONAL MODULES IN
VERSION V


Procedure | Type | Lines | Observed Faults | Relative Frequency | Expected Faults | Relative Frequency
TABLE D.6 FAULT DISTRIBUTION TO FUNCTIONAL MODULES IN
VERSION VI¹⁵


Procedure | Type | Lines | Observed Faults | Relative Frequency | Expected Faults | Relative Frequency

[ome [0 | wm |_| aso | seu | ome | 


3.686 
| conti, | o | 5 | | - | - ae 
| min =| o | os |} | -) | 7 


Max 


a 


IMin 
IMax 


Ceiling 


CheckBatt 
Constaris 


Init Battalion 


Initialize 
Perfom 
Simulation 


Perform OneDt | 0 | 12 | - 
| O | 12 6.7 | 


Prepare For 
NextDt 


Determine 37 
Output 





15 The fault rate is 20/2219 = 0.0090 faults per line (including data
declarations in the line count)

_attion | im | 200 | 62 570 | _ 1.812 
Attrition | om | 42 


Num 27 
Weapons 
Track 29 6.5 
UpdatUse III 29
List
Targets
Suffer 67
Communic IV 6.24 7.5/20 8.112 0.406
ations 
Include IV 39 
Comm Obs 


Collect 
Finished IV 41 
UpdatetL | wv | 42 | om |= 
Communic IV 16 
ation 
Send 
Communic IV 35 
ations | 
NewNum
Send
Send Command
ReceiveCommunications IV 11
Find Receiving Delay IV 55 6.12
Receive Reports IV 56
Receive IV 57
Update Num Vars 30
ProcessCom- IV 11
Handle Queuing IV
Queue Reports IV 38
Find Queue Report
Find Queue Spot IV 19
Processing Delay IV 24
Destroyed 
Squads 
TABLE D.7 FAULT DISTRIBUTION TO FUNCTIONAL MODULES IN
VERSION VII¹⁶,¹⁷


Procedure Type Lines Observed Relative Expected Relative
Faults Frequency Faults Frequency
| conice | oo | 6 | || 


io | os | | ass | 1038s | 0335 
| fot [| o | 4 | - | - 7] -) ae 
|_Wrieedor | o | 2 | - | “| 7) [a 
[Process | o | io | - | - | .- ee 
| Check Params | o | 3 | - | = | = | = 
| CheckNamy | o | 3 | - | + || - ae 
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—— 







Update Info 
Update Battalion 


¹⁶ The fault rate is 31/1800 = 0.0172 faults per line (including data
declarations in the line count).


¹⁷ In this version, certain subprocedures are declared in more than one
location.
- Movement | , 23. 4. | 4977 
Invalid 

~ 7 

| Velocity 


a 


Position 
a 


| SetPosition | 1 


es oe ee 
(ee ee eee 


Update 
Position 
Jo eaters | 


| Velocity | 


—. ot 


aa 


TABLE D.8 FAULT DISTRIBUTION TO FUNCTIONAL MODULES IN
VERSION VIII¹⁸


Procedure Type Lines Observed Relative Expected Relative
Faults Frequency Faults Frequency


0/41 checalg 0.190 















a 


¹⁸ The fault rate is 41/1366 = 0.030 faults per line (including data
declarations in the line count).
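
The fault rates quoted in the footnotes of this Appendix are ratios of observed faults to source lines (data declarations included). As a minimal sketch, assuming only the (faults, lines) pairs printed in those footnotes, the arithmetic can be rechecked as follows. The name `fault_rate` and the list `FOOTNOTE_COUNTS` are illustrative and do not appear in the thesis.

```python
# (observed faults, source lines) pairs as quoted in the Appendix D footnotes.
FOOTNOTE_COUNTS = [(23, 1995), (40, 1544), (20, 2219), (31, 1800), (41, 1366)]


def fault_rate(faults: int, lines: int) -> float:
    """Return faults per source line (illustrative helper, not from the thesis)."""
    if lines <= 0:
        raise ValueError("line count must be positive")
    return faults / lines


for faults, lines in FOOTNOTE_COUNTS:
    # Four decimals are enough to compare with the rounded footnote values.
    print(f"{faults}/{lines} = {fault_rate(faults, lines):.4f} faults per line")
```

The footnotes round to different precisions, so agreement should be checked at the precision each footnote reports.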
APPENDIX E: SHARED BOUNDING DIMENSIONS IN
FUNCTIONALLY CLUSTERED FAILURE REGIONS

In this Appendix we present a detailed analysis of whether functional
clustering of failure regions results in an increased number of shared¹⁹
bounding dimensions.

The results are presented in Tables E.1 to E.8. Each table
corresponds to one of the eight versions of the CONFLICT program
(Shimeall 1991, 1991b). Each table entry corresponds to a pair of failure
regions of the same version, as in Appendix C.

Table entries on the main diagonal contain the functional cluster
identifier (I, II, III, IV, V, VI, O) of the corresponding failure region (cf.
Chapter III). Each off-diagonal entry contains the Jaccard coefficient
for the identical dimensions of the two failure regions labeling the row and
column, followed by the fault type identifier if the two regions belong to
the same functional cluster.


¹⁹ In this analysis, shared dimensions are those that correspond to
identical predicates.
For example, entry (1.1, 1.6) contains 0/O, which means that the Jaccard
coefficient for these failure regions is equal to 0 and that both correspond
to faults of type O. On the other hand, entry (1.5, 1.12) contains 0.154, so
the two failure regions have a Jaccard coefficient equal to 0.154 but
correspond to different types of faults (VI and V respectively, in this case).
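
The entry convention described above can be illustrated with a short sketch. It assumes that each failure region is represented by the set of predicates bounding it and that the Jaccard coefficient is the usual ratio of shared to total distinct elements. The predicate labels (`p1`, `p2`, ...) and the function names `jaccard` and `table_entry` are hypothetical. They are chosen only so that the example reproduces the form of the two entries discussed in the text, and they are not the actual bounding dimensions of any CONFLICT failure region.

```python
def jaccard(dims_a: set, dims_b: set) -> float:
    """Jaccard coefficient: shared dimensions over all distinct dimensions."""
    union = dims_a | dims_b
    if not union:
        return 0.0  # convention chosen here for two empty sets
    return len(dims_a & dims_b) / len(union)


def table_entry(dims_a: set, cluster_a: str, dims_b: set, cluster_b: str) -> str:
    """Format an off-diagonal entry: coefficient, plus '/type' if clusters match."""
    coeff = jaccard(dims_a, dims_b)
    # Three decimals, with trailing zeros dropped so that 0.000 prints as 0.
    text = f"{coeff:.3f}".rstrip("0").rstrip(".")
    if cluster_a == cluster_b:
        return f"{text}/{cluster_a}"
    return text


# Hypothetical regions: 2 shared predicates out of 13 distinct ones.
region_a = {f"p{i}" for i in range(1, 8)}    # p1..p7
region_b = {f"p{i}" for i in range(6, 14)}   # p6..p13

print(table_entry(region_a, "VI", region_b, "V"))  # different clusters
print(table_entry({"q1"}, "O", {"q2"}, "O"))       # same cluster, nothing shared
```

With these hypothetical sets, the first call yields an entry without a type suffix because the clusters differ, and the second yields a zero coefficient followed by the common type identifier.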
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TABLE E.1 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF 
VERSION I 

0.357 0.053 







TABLE E.2 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF
VERSION II


ge a Ea 
25 | ov | oosmv | osssav | oussav | ow | 
_ 0 | a0 | ooo: | oss | 0063 | - 


a a 
ecm | oo | fo | lo | om To 
| ai 





TABLE E.3 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF
VERSION III
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TABLE E.4 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF 
VERSION IV 
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TABLE E.5: JACCARD COEFFICIENTS OF FAILURE REGIONS OF 
VERSION V 
TABLE E.6 : JACCARD COEFFICIENTS OF FAILURE REGIONS OF
VERSION VI
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TABLE E.7: JACCARD COEFFICIENTS OF FAILURE REGIONS OF 
VERSION VII 
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